Data Mining: Some examples and some directions

Authors

  • Alfred Vella
  • Iain McLaren
  • Himanshu Gupta
Abstract

Many successful attempts have been made to deduce information from data. Such data might come from an explicit body of knowledge, for example a database. Alternatively, the data might be implicit, for example the behaviour of a system. The system may itself be physical, such as a helicopter or a factory, or it may be conceptual, such as a financial system. The extraction of information from such data is increasingly referred to as Data Mining, or even as Data Base Mining, even though finding patterns in data has been a preoccupation of scientists for thousands of years. Techniques from Artificial Intelligence such as Rule Induction, Artificial Neural Networks and Genetic Algorithms, as well as the traditional statistical clustering techniques, can be used effectively on small to medium-sized data sets. We illustrate the use of these techniques with examples of applications ranging from the prediction of strain on helicopter parts, through demand forecasting, to shop floor scheduling and the prediction of foreign exchange rates. Despite the undoubted success of these methods, problems arise when the data set becomes very large. Problems of speed and space that could be ignored when dealing with small data sets become more important, and may even dominate, once the set becomes large. Another important issue is the understandability of the model produced. Twentieth-century humans seem to be much less able to accept successful but unexplained models than their ancestors were. In this article we look at the development of our ideas in Data Base Mining, influenced by the rapid expansion in the quantity of data available to companies.


Similar resources

Data Mining in R using Rattle

This paper is a brief introduction to the concepts, methods and algorithms for data mining in the statistical software R using a package named Rattle. Rattle provides a good graphical environment to perform some of the procedures and algorithms without the need for programming. Some parts of the package will be explained by a number of examples. ...


Directions for E-science and Science 2.0 in Human and Social Sciences

In this review and tutorial article, new developments towards extended use of information and communications technologies in science are discussed. The focus is on human and social sciences, specifically linguistics and economics. Some challenging epistemological issues are handled in detail, including the subjective and intersubjective nature of human knowing and how it influences scientific...


Visual Data Mining Techniques

Never before in history has data been generated at such high volumes as it is today. Exploring and analyzing the vast volumes of data has become increasingly difficult. Information visualization and visual data mining can help to deal with the flood of information. The advantage of visual data exploration is that the user is directly involved in the data mining process. There are a large number...


A Survey of Graph Mining Techniques for Biological Datasets

Mining structured information has been the source of much research in the data mining community over the last decade. The field of bioinformatics has emerged as an important application area in this context. Examples abound, ranging from the analysis of protein interaction networks to the analysis of phylogenetic data. In this article we survey the principal results in the field examining them both...


Operations research and data mining

With the rapid growth of databases in many modern enterprises, data mining has become an increasingly important approach for data analysis. The operations research community has contributed significantly to this field, especially through the formulation and solution of numerous data mining problems as optimization problems, and several operations research applications can also be addressed using...


College of Library and Information Services University of Maryland College Park, Md

This paper surveys techniques for applying data mining techniques to large text collections and illustrates how those techniques can be used to support the management of science and technology research. Specific issues that arise repeatedly in the conduct of research management are described, and a textual data mining architecture that extends a classic paradigm for knowledge discovery in datab...



Publication date: 2007